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^Sj p Abstract 



A method to generate reactive trajectories, namely equilibrium trajec- 
i-G ' tories leaving a metastable state and ending in another one is proposed. 

The algorithm is based on simulating in parallel many copies of the system, 
r^ ' and selecting the replicas which have reached the highest values along a 

chosen one-dimensional reaction coordinate. This reaction coordinate does 

not need to precisely describe all the metastabilities of the system for the 

method to give reliable results. An extension of the algorithm to compute 

transition times from one metastable state to another one is also presented. 

We demonstrate the interest of the method on two simple cases: a one- 

C^ dimensional two-well potential and a two-dimensional potential exhibiting 

I , two channels to pass from one metastable state to another one. 

-a 
a. 

1 Introduction 

A very challenging problem in molecular dynamics is to compute reactive paths, 
namely trajectories of the system leaving a given metastable state (say A, an 
open subset of R ), and ending in another one (say B, another open subset 
of M. d such that A (~) B = 0), without going back to A. The difficulty comes 

■^4- ■ from the fact that a dynamics at equilibrium typically remains for a very long 

time around a metastable state before hoping to another one. In other words, 

^^ ' most of the trajectories leaving A will go back to A, rather than reaching B. 

There exist many methods to sample the canonical equilibrium measure in 
such a situation, and compute equilibrium thermodynamics quantities like for 
example the free energy (see [H [20] and references therein) but it is much more 

^ ■ difficult to compute dynamical quantities at equilibrium along reactive paths, 

like transport coefficients and transition rates. 



*INRIA Rennes - Bretagne Atlantique, Campus de Beaulieu, 35042 Rennes Cedex, France. 
Frederic . Cerou@inria.fr 

^INRIA Rennes - Bretagne Atlantique and Universite de Haute Bretagne, Place du Recteur 
H. Le Moal, CS 24307, 35043 Rennes Cedex, France, arnaud.guyader@uhb.fr 

*CERMICS, Ecole des Ponts ParisTech, 6-8 avenue Blaise Pascal, 77455 Marne La Vallee, 
France, tony.lelievre@cermics.encp.fr 

§ CERMICS, Ecole des Ponts ParisTech, 6-8 avenue Blaise Pascal, 77455 Marne La Vallee, 
France, david.pommier@cermics.encp.fr 



In such situations, it is very often the case that a reaction coordinate is 
known, which in some sense, indexes transitions from A to B. In this paper, a 
reaction coordinate is meant to be a smooth one-dimensional function: 

f : R d ->• R 

such that: 

|V£| ^0, Ac{x£ R d , £(x) < z min } and B C {x € R d , £(x) > z max }, (1) 

where z m \ a < z max are two given real numbers. Let us denote 

Z z = {xeR d ,£(x)=z} 

the submanifold of configurations at a fixed value z of the reaction coordinate. 
For the algorithm we propose to give reliable results, one needs S^ min (resp. 
S Zmax ) to be "sufficiently close" to A (resp. B). More precisely, we require that 
most trajectories starting from the submanifold S 2min (resp. S 2max ) ends in A 
(resp. in B). 

The algorithm we propose in this paper is inspired from methods used in 
statistics to deal with rare events simulations and estimations. This approach, 
known as multilevel splitting, dates back to Kahn and Harris [18] and Rosen- 
bluth and Rosenbluth [26]. All the variants share the same basic idea, that 
is discarding the failed trajectories, and splitting (or branching) the trajecto- 
ries approaching the rare set. We refer the reader to |14| for a review of the 
multilevel splitting method and a list of references. We will mainly focus here 
on adaptive variants of this method which have been recently proposed in this 
domain [31 [15] . 

In our context, the idea is to perform an iterative process on many replicas 
of trajectories which start from the metastable region A, and end either in A 
or in B, and to kill progressively the trajectories which have not reached high 
values along £. At the end, an equilibrium ensemble of trajectories starting 
from A and ending in B are obtained, with a bias which scales like 0(1/N), 
N being the number of trajectories. This scaling of the bias is classical for 
Monte Carlo methods based on interacting replicas [6] , and is typically negligible 
compared to the statistical noise which scales like 0(\/yN). Compared to a 
brute force algorithm, the computational cost is typically reduced by a factor 
1 000 (see Section [3] for more details) . The details of the algorithm are provided 
in Section [2j 

One of the differences between the algorithm we propose and the transition 
interface sampling method |29t |2"8] , the forward flux sampling method [2J [T] , or 
the milestoning method \i2\ [22] whose aim is also to compute reactive trajec- 
tories through paths ensembles, is that we do not need to decide a priori of a 
given discrete set of values z m \ n = zq < z\ < z<± < ■ ■ ■ < z n = z max through 
which the trajectories (more precisely, the reaction coordinate along the trajec- 
tories) will go. In some sense, these are adaptively selected by the algorithm, 
with typically fine discretizations in regions with high gradients of the potential 
energy, before saddle points, and coarser discretizations in flat regions. Other 



techniques to sample reactive trajectories include the string method pTTT [9l [TO] . 
see also the review paper [7]. 

The main interests of the algorithm we propose are: (i) It does not require 
fine tuning of any numerical parameter, nor a priori discretization of the re- 
action coordinate values; (ii) It can be applied to any Markovian stochastic 
dynamics (overdamped Langevin, Langevin, Hybrid Monte Carlo, etc.); (hi) 
It is reliable whatever the chosen reaction coordinate satisfying (prj, and in 
particular if the reaction coordinate does not describe all the metastabilities. 
This is for example the case if, conditionally to a given value of £, the canonical 
measure is multimodal (or, equivalently, the potential energy exhibits wells sep- 
arated by high barriers along some submanifolds T, z ). This is actually a generic 
situation in practice, encountered in particular when multiple pathways link the 
two metastable states A and B (see Section 13.21 for a numerical illustration) . 

The article is organized as follows. Section [2] describes the algorithm to 
compute reactive paths. Section [3] illustrates the efficiency of the algorithm 
on two prototypical cases. Section H] proposes an extension of the approach to 
evaluate mean transition times, which is illustrated in Section [5] by some nu- 
merical applications. Finally, conclusions and possible extensions are discussed 
in Section [6l 

2 Computing reactive trajectories: the algorithm 

2.1 Reactive trajectories 

Let V : R — > R denote the potential function, and let us consider, to fix ideas, 
overdamped Langevin dynamics: 



dX t = -VV(X t ) dt + V^/FWi, (2) 

where /3 = l/(/c^T). As mentioned above and as will become clear below, the 
algorithm actually applies to any (continuous in time) Markovian stochastic 
dynamics, like Langevin dynamics, or Hybrid Monte Carlo [8] for example. 
The equilibrium canonical measure is 

dfi = Z~ exp(— /3V(x)) dx 

where Z = / exp(— /3V(x)) dx. The equilibrium trajectories are those ob- 

Jm d 
tained with initial conditions Xq distributed according to //, and which sat- 
isfy ([2]). We assume that A and B are two metastable regions for the dynam- 
ics ([2]), namely a trajectory starting from A (resp. B) remains for a long time 
in a neighborhood of A (resp. B). A reactive trajectory (from A to B) [171 124] 
is a portion of an equilibrium trajectory which links A to B. Thus, it is a 
trajectory which leaves A, does not come back to A, and reaches B. It is in 
general difficult to generate such trajectories, and we propose an algorithm to 
build an ensemble of N reactive trajectories, using a reaction coordinate £ which 
satisfies (HJ. In the numerical experiments below, we will test various reaction 
coordinates, but, to fix ideas, one could think of £(x) = \\x — xa\\ where xa £ A 
denotes a reference configuration in A, and || • || is the Euclidean norm. 



2.2 Details of the algorithm 

The algorithm starts with an initialization procedure which consists in three 
steps (see the end of Section 14.21 for an alternative equivalent initialization 
method): 

1. Generate an ensemble of initial conditions (Xq )i< n <N distributed accord- 
ing to the canonical measure \i conditionally to being in A, namely: 

dfiA = %a exp(— (3V(x))1a(x) dx 

where Za = f A exp(—/3V(x))dx and 1^ denotes the characteristic func- 
tion of the region A. For this first step, we use a subsampling of a long 
trajectory satisfying ([2]), starting in A and remaining in A through a 
Metropolis-Hastings procedure. 

2. Starting from the initial conditions (^Q)i< n <jv, ^ or eacn n £ {!> • • • ;-^}j 
compute the trajectory (X")t>o satisfying ((21) driven by a Brownian mo- 
tion W", until the stopping time 

a n = mf{t>0,Z(X?)>z min }. 

The Brownian motions (W/ 1 ) are of course assumed to be independent. 
Only the end of the trajectory (X i n )o<i< (J «, between A and £ Zmin , is re- 
tained. With a small abuse of notation, let us denote the last time at 
which X™ leaves A (and a n accordingly), so that Xq G dA, X™„ G £z min 
and X™ does not touch either dA nor S 2min for t G (0, a n ). This second 
step (which can be seen as one iteration of a transition interface sampling 
procedure) is not computationally demanding since £ Zmin is assumed to 
be "close to" A so that a n is typically small. 

3. Continue the simulation of {Xf) t > a ^ (according to ([2]) with Wt = WJ 1 ) 
until the stopping time 

T n = inf{t > a n , X™ G A U B}. 

At the end of the initialization procedure, one has an ensemble of N equilib- 
rium trajectories (X™)o<t< r «, which leave A, end either in A (the most likely) 
or in B, conditionally to the fact that sup 4>0 £PQ n ) > z m \ n . Let us denote 
these trajectories (X t ' n ) (for n G {1, . . . , N}) and the associated stopping times 
T i,n _ r n_ Now the algorithm goes as follows (see Figure [T]for a schematic rep- 
resentation): Iterate on k > 1, 

4. Compute the largest reaction coordinate value attained for each path: 

z k ' n = sup £(X^ 



0<t<T k ' n 



5. Order the values (z ' n )i< n <N'- 

z k,e k (l) < z k,e k CZ) < ... < z k,e k (N) } 



where e k is a permutation over {1, . . . , N}. To simplify the notation, let 
us denote 

n k = e k (l) = argmin z k ' n , 

ne{l,...,N} 

the index which realizes the smallest value z ,e ' 1 ', and let us denote q 
the (empirical) quantile of order 1/N, namely this smallest value: 

k = z k,e\l) = z k,n* = min z k,n_ 
n£{l,...,N} 

L, k 

6. Kill the trajectory (X t ' n )n <t<T k, n k , and consider trajectories for iteration 
k + 1 as follows: 



• For all n ^ n , the n-th trajectory is unchanged: r + ,n = r '" and 

/ v k+l,n\ _ i v k,n\ 

K-A-t >0<t<T k + 1 - n — K-A-t J0<t<r fe .'M 

• Generate a new n fc -th trajectory in three steps: (i) Choose at ran- 
dom (with uniform law) i^ G {1, . . . , n — 1, n + 1, . . . , N}; (ii) Set 

x k+l,n k = x k,i k for ^ t G (0)(jfc) where 

a fc =inf{t>0,e(X^)>g fc }; 

(iii) Generate the end of the trajectory (X t ,n )t>a k according to ([2]) 
(with Wt = W™ k ) until the stopping time 

T k+l,n k = inf | f > aki x} +1 > n " EAUB}. 

7. Go back to 4 (with k being k + 1), until q k > z max - More precisely, the 
number of iterations is defined as 

fcmax = SUp{k >l,q k < Z max }. 

At iteration k of the algorithm, one obtains an ensemble of iV equilibrium 
trajectories (X t ,n )o<t<T fc ' n > which leave A, end either in A or in B, conditionally 
to the fact that sup 0<t<r fc,n i{X^) > q k . 

At the end of the algorithm, all the trajectories cross the submanifold £ fc max . 
Since /c max is the last iteration index for which the quantile q is smaller than 
z max and since S Zmax is assumed to be "close to" B, most of them end in B. 
The final step to retain only reactive trajectories is: 

8. We retain only the trajectories which indeed end in B to perform statistics 
on reactive trajectories. We denote r the proportion of such trajectories 
among the ones obtained at the final iteration & max . 

This last step does not introduce any additional bias since (recall that q fcmax < 

£-ma x J 

B C{x€ R d , $(x) > q km ^}. 

Note that it is very simple to adapt this algorithm to any stochastic Marko- 
vian dynamics, the new paths being generated at each iteration using indepen- 
dently drawn random numbers. 






Figure 1: Schematic representation of the algorithm, with N = 3 paths: on 
the left, the path which goes the less far in the reaction coordinate direction is 
killed, and, on the right, a new path is generated, starting from a previous one 
at the 1/N empirical quantile value. 



2.3 Discussion of the algorithm 

As mentioned above, at the end of the k-th. step of the algorithm (steps 4 to 7), 
one has an ensemble of N equilibrium trajectories, which start from A, end ei- 
ther in A or B, and are distributed conditionally to the event {sup 0<t<r £(Xt) > 
q k }. It can be shown (see [3]) that the bias scales like 0(1/N) and that the 
standard deviation scales like 0(l/yN). 



At the end of the day, note that 
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1 

N 



(3) 



gives an estimate of the probability p of "observing a reactive trajectory" , since 
at each iteration of the algorithm, only one in the N trajectories is killed. 
More precisely, this is an estimate of the probability, starting from S 2min with 
the equilibrium distribution generated by the initialization procedure (steps 1 
to 3 above), to observe a trajectory which touches B before A. This probability 
actually depends on the choice of £ Zmin , while the law of the reactive trajectories 
generated by the algorithm does not. 

There exist generalizations of the above algorithm (see [3]), where, instead of 
killing only 1 trajectory and working with the quantile of order fa, i trajectories 
are killed and the quantile of order fa is considered (where i G {1, . . . , N — 1}). 
We observe numerically (and this can be checked theoretically at least in some 
simple situations, see [15]) that i = 1 yields the best results. However, in 
those preliminary simulations, we did not use the fact that the algorithm with 
a quantile of order fa (and i > 1) enables to simulate the new i replicas in 
parallel, which may lead to an interesting computational gain in cases when 
simulating a new trajectory is computationally demanding. 

A difficult mostly open question is to compute the asymptotic variance of 
an estimator associated to a given observable over reactive trajectories (say 
the time length of the trajectory) and then to try to optimize this estimator 
with respect to the chosen reaction coordinate. It can be shown that in terms 
of the asymptotic variance of &n, the optimal reaction coordinate (commonly 



called importance function in the context of statistics, see [5]) is the so-called 
committor function q (see [171 111] ) which satisfies: 



-W • Vq + fi^Aq = in R d \ (A U B), 
9 = on dA and q = 1 on dB. 



(4) 



The function q can be interpreted in terms of the process Xf solution to ([2]) 
with initial condition Xq = x, as: 

q(x) = P (X? reaches B before A) . (5) 

We will check numerically below that the variance of the results seems to be 
smaller if £ is chosen close to q. But one interesting feature of the method is 
that it does not need to be the case to give reliable results in terms of reactive 
trajectories. 

2.4 Complexity of the algorithm 

The number of iterations /c max of course depends on the problem at hand. If the 
probability of reaching {x € M d ,£(x) > z max } starting at equilibrium from S^ min 
without touching A is denoted p, then it has been proved in [15] that the random 
variable £; max has a Poisson distribution, with mean E[£; max ] = — iVlog(p). In 
the numerical simulations below, the probability p ranges between 10~ 18 and 
10~ 4 depending on the situation considered. 

The initialization procedure (steps 1 to 3) is straightforward to parallelize. 
Then, note that at each iteration, only one new trajectory needs to be com- 
puted, until it reaches one of the two metastable states A or B. Since A 
and B are typically metastable sets that equilibrium trajectories visit very 
often, this should not be too costly. Moreover, inserting the new maximum 
value z +1,n within the set of former values is a very rapid procedure, since 
they were already ordered. In practice, we use a dichotomy recursive algo- 
rithm, by comparing z +l,n to the median value (quantile or order 1/2) of 
{z k,x , . . . , z k ' n _1 , z k ' n +1 , . . . , z k ' N }, and then to the quantile of order 1/4 or 
3/4 depending on its position with respect to the median value, etc. Then the 
insertion of the new particle at the right place in the ordered sample has also a 
cost in O(logiV) via a min-heap algorithm (see for example [H]). This yields 
an algorithm of complexity 0(log N) at each iteration. 

Concerning the cost of the algorithm in terms of memory, two remarks are 
in order. First, for trajectories which end in A rather than B, one can discard 
the part of the trajectory after it reaches its maximum value along £, since this 
piece is then never used in the algorithm. Second, at iteration k, one may only 
retain a coarse description of each trajectory up to the first time it reaches the 
quantile value q k along £. Indeed, in the forthcoming iterations, this part of the 
trajectory is not used anymore. By coarse description, we mean that one only 
needs to retain the features of that part of the trajectory that are needed to 
perform the required statistics over reactive trajectories. For example, if only 
the time length of the reactive trajectory is needed (see for example the results 
in Section [HD, the whole trajectory until this time may be discarded. In any 
case, one could think of storing only a subsampling of this part of the trajectory. 



3 Computing reactive trajectories: numerical illus- 
trations 

3.1 A one-dimensional case 

In this section, we consider a one-dimensional situation and overdamped Langevin 
trajectories ([2]), with V being the double- well potential: 

V(x) = x A - 2x 2 . 

This potential has two global minima at ±1 and one local maximum at 0. In 
this simple one dimensional setting, we set as metastable states A = {—1} and 
B = {+1}, and the reaction coordinate is taken to be simply 

£{x) = x. 

For the numerical experiments, we take z max = — z m { n = 0.9. The aim of this 
section is mainly to validate the algorithm by comparing the results to those 
obtained by direct numerical simulation (DNS), namely a simple Monte Carlo 
algorithm without any selection procedure, and then to have an idea of the 
computational gain. DNS can only be used for small values of /3 (typically 
/3 < 10 in this setting). 

Distribution of the time lengths of reactive paths. To validate the al- 
gorithm, we compute an histogram of the distribution of the time length (dura- 
tion) of a reactive path. On Figure[2l we compare the results obtained with DNS 
and our algorithm: the agreement is excellent. The interest of our algorithm 
is that it is possible to compute this distribution for very small temperatures 
(large values of (3), for which a DNS cannot be used. 

We observe (see Figure [3|) that this distribution seems to be very close to 
an inverse Gaussian distribution with density (see also |21j): 

f ^ x{x) = \2^) 6XP { W* J X> °' 

where [i and A are two positive parameters. Inverse Gaussian distributions 
naturally appear when considering first passage time at a fixed level a > for 
a one-dimensional stochastic process Y t = vt + aWt- 



T a = inf{t>0|l r t = a}~JG(a, 



7PI 



Optimal values for the parameters (/i, A) to fit the time lengths distributions 
are provided in Table [TJ We again observe that the agreement between DNS 
and our algorithm is excellent. 

Computational time. Let us now compare the computational time required 
to simulate an ensemble of reactive paths. All the computations have been 
performed using an Intel Xeon CPU (2.50GHz), with 16 Go RAM. We use a 
Mersenne Twister algorithm [23] for the random number generator. 
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Figure 2: Distribution of the time lengths of reactive paths. In the first three 
figures we compare results computed with DNS and our algorithm. In the 
last figure, distributions of the lengths are given for various values of j3: (3 = 
1,5,10,15,20,30,40. 
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Figure 3: Distribution of the time lengths of reactive paths and the inverse 
Gaussian law fitted to this distribution, for /3 = 1, 10, 20, 40. 
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DNS (/i,A) 


Algo 0, A) 


1 


0.59,2.14 
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1.36,10.38 


10 




1.64,20.02 


15 




1.80,28.40 


20 




1.91,35.63 


40 




2.16,59.32 



Table 1: Fitted parameter on Inverse Gaussian law of the probability distribu- 
tion of the time stopping of reactive path for various values of j3. 
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Table 2: Probability a^ (see ([3])), computational time and relative variance 
(RV) for the estimators of p, for different values of /3 and number of paths N. 
DNS CPU time with * is an extrapolated time deduced from a small number 
of iterations. 

In Table El we give CPU times for various values of /3, using DNS (when 
possible) or our algorithm. The DNS time simulation rapidly explodes when f3 
increases. For (3 = 15 and N = 10 5 , the ratio between the CPU time of a DNS 
and the CPU time of our algorithm is of the order of 1 000. 



Variance of the estimators dj\r. To complete the discussion on computa- 
tional time, we also compare in Table [2] the relative variances of the estimators 
ajv of the probability p, for DNS and for our algorithm. The relative variance 
is defined as the variance divided by the mean square. With the notations of 
this table, the relative variance of the DNS estimator for a^v is estimated by 
(1 — 6ln)/N. With our algorithm, it has been proved [15] that this relative vari- 
ance can be estimated by — log(aAr)/A r . This explains why in the five last rows 
of Table [2] (where A^ = 10 5 ), the relative variance of our estimator increases 
very slowly (in fact, logarithmically) when the probability of interest decreases 
to zero. To take into account both computational time and variance, it has 
been proposed in [16] to define the efficiency of a Monte Carlo process as "in- 
versely proportional to the product of the sampling variance and the amount of 
labour expended in obtaining this estimate." Using this definition of efficiency, 
for /3 = 15 and iV = 10 5 , our algorithm is about 800 times more efficient than 
DNS. 
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3.2 A two-dimensional case with two channels 



In this section, we apply the algorithm to a two-dimensional situation, again 
with overdamped Langevin trajectories ([2]). The potential V we consider is 
taken from \lb\ \ 



(x+l) 2 -y 2 , n o™4 



V(x,y) = 3e- x2 -( y -^ 2 ■ 
-5e 



+ 0.2x^+0.2 y 



■i 



(6) 



This potential (see Figure H|) has two deep minima approximately at H± = 
(±1,0), a shallow minimum approximately at M = (0,1.5) and three saddle 
points approximately at U± = (±0.6,1.1) and L = (0,-0.4). In the notation 
of our algorithm above, A (resp. B) denotes a neighborhood of i7_ (resp. 
H + ). The two metastable regions A and B can thus be connected by two 
channels: The upper channel around the points (H-,U-,M,U+,H+) and the 
lower channel around the points (H_,L,H + ). 



V(x,y) 




Figure 4: The potential V. 



From large deviation theory [13) . it is known that in the small temperature 
limit (/3 — > oo) the reactive trajectories which will be favored will go through 
the upper channel, since the energy barrier is lower there. On the other hand, at 
higher temperature, the lower channel is also very likely, since the trajectories 
going through the upper channel visit a broader region of space, and are thus 
less favored due to an entropic effect. In other words, the lower channel is more 
narrow. This entropic switching effect is well-known and has been studied 
in 



For the numerical experiments, following [23], we use the two following 
values for the temperature: j3 = 6.67 (low temperature), which is such that 
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most of the trajectories go through the upper channel, and f3 = 1.67 (high 
temperature), which is such that most of the trajectories go through the lower 
channel. 

The region A (resp. B) is defined as the Euclidean ball B(H^, 0.05) (resp. 
B(H + , 0.05)). To discretize the dynamics ([2]), we use an Euler scheme with a 
time-step size At = 10 -2 . To test the influence of the choice of the reaction 
coordinate, we will consider two different reaction coordinates: 

£i{x,y) = x, and £ 2 (x, y) = || (x, y) - H-\\. (7) 

For the numerical experiments, we take z m \ n = 0.05 (resp. z max = 0.9) 
when the reaction coordinates is £i and z m i n = 0.05 (resp. z max = 1.5) when 
the reaction coordinates is £2- We will plot the probability density p of positions, 
conditionally on being on a reactive trajectory. The discretization of this density 
uses a regular grid of size 100 x 100 with constant x and y intervals. 

Figure [5] gives the estimation of the density p obtained with the two re- 
action coordinates ([7]), for N = 10 4 , 10 5 and for the two temperature values 
/3 = 1.67, 6.67. We observe that the numerical results are slightly better with £2 
(which is actually closer to the committor function q, see Figure 10 in [21]): in 
particular, the selection procedure retains more diversity in the paths around A 
with this reaction coordinate. However, it can also be observed that the results 
obtained with these two reaction coordinates are all consistent: one does not 
need to know a priori the committor function to get reliable results. In the 
following, we concentrate on £ = £2; which is actually an interesting generic 
reaction coordinate since it only requires the knowledge of a reference configu- 
ration in A. 

On Figure El we test the influence of the time-step size, by plotting the 
density p obtained using two time-step sizes: At = 10~ 2 (as above) and At = 
10~ 3 . It appears that the results have indeed converged for the time-step size 
At = 10" 2 . 

An important quantity computed from reactive paths is the flux of reactive 
trajectories, which is defined (up to a multiplicative constant) as |17t 124] : for 
any domain V G M. d \ (A U B), 



id 



divJ = P(a reactive trajectory enters T>)— P(a reactive trajectory leaves T>). 



On Figure we plot the flux J at the two temperature values. On the top 
figures, we use the grid of size 100 x 100. On the lower figures, we plot the flux 
on a sub-grid of size 20 x 20 with constant x and y intervals to better illustrate 
the overall direction. It is clear from these figures that at low temperature (/3 = 
6.67), the upper channel is favored, while at higher temperature (/? = 1.67), the 
lower channel is more likely. 

We would like to stress that all these results are in agreement with those 
obtained in [24] . where similar results are obtained using a different numerical 
method. 

On Figure El a few reactive paths are plotted. To quantify the fact that 
the upper or the lower channel is preferentially used by reactive paths, let us 
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Figure 5: Density p for different choice of the reaction coordinate, of j3, and of 
the number of paths N. The first column corresponds to the reaction coordinate 
£1 and the second to £2- For the first two rows, the number of paths is N = 10 4 
and for the last two rows, N = 10 5 . The value for f3 is given on each figure. 
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Figure 6: Density function p for simulations with At = 10~ 2 (left column) and 
At = 10~ 3 (right column). For this test N = 10 4 , the reaction coordinate is £2 
and the value of /3 is given on each figure. 
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Figure 7: Flux of reactive trajectories, at inverse temperature f3 = 1.67 on the 
left, and (3 = 6.67 on the right. The color indicates the norm of the flux. 
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Figure 8: A few reactive paths for j3 = 1.67 (left), j3 = 6.67 (right). 



consider X% which is the y- value of the reactive path at the first time uo such 
that the x-value of the process Xt is equal to 0. We consider that the reactive 
path goes through the upper (resp. the lower) channel if X% is larger than 
0.75 (resp. smaller than 0.25). For (3 = 6.67, the proportion of paths such that 
0.25 < Xa < 0.75 is 0.28% and the paths going through the upper (resp. the 
lower) channel is 62.55% (resp. 37.17%). For (3 = 1.67, the proportion of paths 
such that 0.25 < X% < 0.75 is 11.26% and the paths going through the upper 
(resp. the lower) channel is 31.46% (resp. 57.28%). 

Finally, we plot on Figure [9l the histogram of the time lengths of reactive 
trajectories, at the two temperatures. We observe two modes in this distribution 
when (3 = 6.67, corresponding to the two channels. These two modes overlap 
when f3 = 1.67. 



4 Computing transition times 

In this section, we would like to explain how to use the algorithm above to 
compute transition times, namely the time required to go from one metastable 
state to another one. In the notation introduced above, it is typically the 
time required to go from dA to dB. A trajectory starting from dA (with a 
distribution to be made precise) and reaching dB is called a transition path. It 
is different from a reactive path: a reactive path is only the last portion of a 
transition path, linking dA to dB without going back to A. 

The procedure proposed in this section will be tested numerically in Sec- 
tion[5]on the two cases considered above: the double- well potential of Section I3TT1 
and the two dimensional potential of Section [ 



4.1 The one-dimensional case 

Let us first explain the principle of the algorithm in the one-dimensional setting, 
for simplicity. In this case, A and B are singletons, say A = { — 1} and B = {+1} 
and the reaction coordinate is simply £(x) = x so that Y> z = {z}. One could 
think of the double-well potential considered in Section 13.11 

The time we are interesting in here is the time needed to go from A to B, 
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Figure 9: Distribution of the time lengths of reactive paths for (3 = 1.67 (top 
left), (3 = 6.67 (top right) and zoom on the two modes of the distribution for 
(3 = 6.67 (bottom). The distribution of the bottom left (resp. bottom right) 
corresponds to the time lengths of paths through the lower channel (resp the 
upper channel). 
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namely 

T = mi{t>0,X t eB} (8) 

where X% solves (JS]) with initial condition 

X = -1. 

Note that this time T is typically much larger than the time length of a reactive 

trajectory r considered above. 

To compute this time T, let us consider X.2 min = {z m in} and cut a trajectory 

which ends in B into several pieces. Starting from A, the trajectory will touch 

S 2min , within a time Tj . Then, two events may occur. Either the trajectory 

goes to B before reaching A (with probability p, and within a time T3) or it 

goes back to A before reaching B (with probability (1 — p), and within a time 

T^)- By iterating the argument and using the Markov property, one can check 

that: 

1-1 

T = Y J {T l 1 +n) + {T( + T- i ) (9) 

i=\ 

where, for i > 1, T\ (resp. T|) are identically distributed random variables, 
and I is a geometric random variable independent from (T^T^T^) with pa- 
rameter p: for i G {1, 2, . . .}, P(J = i) = (1 - p) i_1 p. In ©, ^°=l ' = ° b y 
convention. 

From ([9]) and using the fact that E(J) = 1/p, it is easy to check that 

E(T) = f - - 1 J E(Ti + T 2 ) + E(Ti + T 3 ), (10) 

where we denoted Ti = Tj 1 the time for a trajectory starting at —1 to go to 
z m ; n and Ti = T2 1 the time for a trajectory starting from z m \ n to go back to —1 
without touching 1. 

Let us now explain how we evaluate numerically each term which appears 
in flU]): 

• The average time E(Ti + T2) which corresponds to the mean time for 
a trajectory starting from A, to reach S^ min and then to go back to A 
without touching B is obtained by DNS. Indeed, since S Zmin is typically 
close to A, this is easily obtained by simple Monte Carlo simulations. 

• The average time E(Ti + T3) which corresponds to the mean time for a 
trajectory starting from A, to reach S Zmin and then to go to B without 
touching again A is obtained by a slight modification of the algorithm of 
Section [2.21 Namely, in Step 2 of the initialization procedure, we do not 
discard the beginning of the trajectory to get the whole time, starting 
from A, to reach S Zmin (taking into account all the recrossings of dA). 

• Finally, the probability p which is the probability for a trajectory starting 
from S Zmin to reach B before touching A is also obtained by a slight 
modification of the algorithm of Section 12.21 Namely, the initialization 
steps (1 to 3) are simply replaced by setting Xq = z m \ n . Then, djy defined 
by © yields an estimate of p. 
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Let us now explain how to generalize this approach to high-dimensional situa- 
tions. 

4.2 The general case 

We would like to generalize the method above to higher dimension. In such a 
case and compared to the previous section, dA (resp. dB) plays the role of —1 
(resp 1). Again, a transition path is cut into pieces: a first part of duration 
T/ between dA and S 2min , then (possibly) a first part of duration T 2 X between 
S 2min back to dA, ... until a I-th piece of duration T{ between dA and £ 2min 
and a final piece of duration T3 between X 2min and dB. 

The difficulty when generalizing the reasoning above is to keep the Marko- 
vian properties which have been used to cut the trajectories into pieces and 
obtain formula (jl(jp . namely: 

(i) The fact that the number / of trajectories from dA to S 2min before ob- 
serving the transition to dB is a geometric variable with parameter p, 
independent of the times (T^T^T^). This requires that for a trajectory 
coming from dA and reaching S 2min , the probability of reaching B before 
A does not depend on the hitting point on S 2min . 

(ii) The fact that the distribution vqa of starting points on dA and the asso- 
ciated distribution v-£ z . of first hitting points on £ 2min are "balanced" in 
the sense that: starting from uqa on dA, the first hitting points on S 2min 
are distributed according to vy,_ ; and starting from iA], on S z . , 

z min "mill min 

the distribution of first hitting points on dA is again vqa- 
These two conditions are satisfied if specific S 2 . ,vqa and za]_ are considered 

mm "min 

(see [3D] for related considerations): 

Proposition 1. Let us consider Xt solution to §2$ and the associated committor 
function q defined by (jl]). Let us assume that 

(HI) S 2min is an isocommittor surface 

namely that there exists a qo £ (0, 1) such that 

E Zmin = {x e R d , q(x) = q }. 

Then, property (i) above is satisfied. Assume moreover that: 

(H2) Xq is distributed according to dvQA = ZQA\^l\ e ~ dadA- 

Then, the distribution of first hittinq times on S 2 . is 

dw z . = Z^ 1 \Vq\e~ pv da^ . , 

'•min z min min 

and property (ii) above is satisfied. 

In this proposition, the measures oqa and a^ z . are the Lebesgue measures 
on dA and S 2min induced by the Lebesgue measure in the ambient space M. d , 
and the Euclidean scalar product. 
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Proof. Let us first start with property (i), assuming (HI). By definition of 
the committor function q (see ©), for any point x G S 2min , the probability to 
reach B before A is p (independently of x), and thus, the number of trajectories 
from dA to S 2 . before a transition to dB is a geometric random variable with 
parameter p, independent on the times (T^T^T^) introduced above. 

For property (ii), let us consider a diffusion X t solution to (J2]) with initial 
conditions Xq distributed according to vqa- The aim is to prove that, for any 
fixed test function 99 : M. d — > M., 



E(<p(X Tl )) = J <fidvx Zmin , (11) 

2 min 

where T\ = inf{£ > 0,Xt € £ 2min }. Let us denote 

/(x)=E(^(Xf f )) 

where Xf is the solution to ([2]) with starting point Xq = x, and Tf the associ- 
ated first hitting time of S 2min . Then, (jlip amounts to proving that: 

Z a j ffexp(-pV)\Vq\ da dA = z£ f fexp(-pV)\Vg\ do^, 

J *min / nun 

or equivalently 

Zq\ f fexp(-f3V)\Vq\ 2 S q{x) (dx) = Z^ min J fex V (-pV)\Vq\ 2 5 q(x) _ qo (dx), 

" (12) 

where we used the measure 5 q ^_ r (dx) defined by: 

8q(x)-r(dx) = \Vq\~ 1 da {x>q ^ =r} . 

This measure may also be seen (using the co-area formula, see for example [20|, 
Lemma 3.2]) as a conditional measure such that: for any test function tp 



ip(x)dx= / / ip 5 g ( x }_ r (dx) dr. 

J J {x,q(x)=r} 

One can check the following important derivation property (see for example [201 
Lemma 3.10]): 

— / V $q(x)-r(dx) = / div(ipX7q\Vq\- 2 )5 q{x) _ r (dx). (13) 

dr J{ x> q(x)= r } J{x,q{x)=r} 

Let us go back to the proof of (|12p . What can actually be checked is that 
I'(r) = where 

I(r) = JfeM-PV)\Vq\ 2 5 q(x) _ r (dx), 

which implies (fT2|) (consider the case <p = f = 1 to get that Zqa = Z^, z . ). 
Indeed, we have (using (f!3j) ): 



I'(r) = J div(feM-PV)Vq)5 g(x) _ r (dx) 

= J Vf ■ VqeM-PV)6 qix) _ r (dx) + J div(exp(-f3V)Vq)f6 q{x) _ r (dx). 
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The second term is zero using §%§. Likewise, the first term is zero since, for any 
test function h : R — > R, 

J h(r) J Vf ■ VqeM-PV) S q{x) _ r (dx) dr = j h o q Vf • Vqexp(-pV) 

= fvf-V(Hoq)exp(-pV) 

= - I div(Vfexp(-(3V))Hoq 

= 

where o denotes the composition operator, and if is a primitive of h. This 
concludes the proof. □ 

Thus, under assumptions (HI) and (H2), formula ()10p above holds with T 
being defined as the transition time from dA to dB and Xq being distributed 
according to VdA- Even though this measure may not appear to be very natural, 
the result should not be very sensitive to the choice of the distribution on dA 
since A is assumed to be a metastable state so that any trajectory starting 
from dA will rapidly loose the memory of its starting point after a mixing time 
within A. One simple interpretation of this measure is that it is the stationary 
measure of the Markov chain which, to one point of dA associates the next point 
on dA for the dynamics ([2]) after S^ min has been touched. Note eventually that 
the techniques presented in Section [5] to evaluate E(2~i + T2), E(Ti + T3) and p 
can also be applied in higher dimensional cases. 

A few remarks are in order. First concerning (HI), it is in general impos- 
sible to find a precise approximation of an isocommittor in high dimension. 
However, the algorithm we propose only requires such an isocommittor in a 
neighborhood of A and thus, a rough approximation may often be obtained 
by simple considerations, like defining £ 2min as the configurations at a given 
(small) distance of A. We will check below on a two-dimensional case that a 
rough approximation S^ min of an isocommittor in a neighborhood of A does not 
imply a too large error on the estimated mean transition time. Of course, this 
remains to be investigated theoretically, or numerically for more realistic test 
cases. It would be interesting to show, using the metastable features of A, that 
the time lengths of the cycles from A to A are actually close to be independent. 
This can be done for Markov chains introducing so-called regeneration times 
(see for example [27]). We are currently investigating if this approach could 
yield a justification of ([9]) without assumption (HI). Second, concerning (H2), 
a simple way to generate an ensemble of configurations distributed according to 
uqa is to subsample the Markov chains of visiting points to dA for excursions 
leaving dA, touching S Zmin , an then going back to dA (as mentioned above). 
This initialization procedure could also be used alternatively to steps 1-2-3 in 
the algorithm described in Section! 
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5 Computing transition times: numerical illustra- 
tions 

5.1 The one-dimensional case 

Let us consider again the one-dimensional double-well potential already used 
above in Section [3TTT with A = {—1}, B = {1} and £(x) = x. Recall that the 
transition time is defined as 

T = inf{t > 0,X t = 1 \X = -1}. 

Using the notation of Section 3J let us also define 

T x = mi{t >0,X t = z min | X = -1}, 

T 2 = inf{i >0,X t = -l\X = z min and Vs G (0, t), X s / 1}, 
T 3 = inf{t > 0, X t = 1 1 X = z min and Vs G (0, £), X s ^ -1}. 

We use formula (|10p to estimate E(T). 

First, in Tables and HI we compare results obtained from DNS and from 
our algorithm using (jlOp , for small values of /3 (/3 < 7) . In Table El we give the 
results of the transition time estimation for different values of /3, At and z m { n . 
Our estimation converges to the DNS estimation when At decreases. The dif- 
ference between the results obtained by DNS and by our algorithm is smaller 
when z m i n increases or /3 increases. For /3 = 7, we obtain an error of less than 
0.5% compared to a DNS. It seems that the main source of error is due to 
an underestimation of the probability to hit back A when starting from S^ min . 
This is related to the simple Euler discretization of the diffusion ([2]) . This fact 
is illustrated on Table HI where we see that the error on p decreases when z m ; n 
increases, At decreases or /3 increases. 

Notice that the confidence intervals (C.I.) given in the tables are 95% con- 
fidence intervals, estimated with 10 independent runs. 

Finally, results for larger values of /3 are given in Table [5J Note that K(T^) 
is not significant in the estimation of E(T) for large values of (3. We also check 
(see Figure [TO]) that the numerical results satisfy the theoretical asymptotic 
behaviour obtained from the large deviation theory: 

E(T) ocexp(/3Ay), (14) 

where AV = 1 is the height of the energy barrier between —1 and 1. 

5.2 The two-dimensional case 

Let us go back to the two-dimensional case already used in Section 13.21 Recall 
that the transition time is defined as 

T = inf{t > 0, X t G B \ X ~ u dA }. 

Using the notation of Section 15.11 let us also define 

T x = inf{t > 0, X t G S Zmin | X ~ v 9A }, 
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p 


At 


•2-min 


E(T) 


E(T) 


C.I. on E(T) 


Error 








(DNS) 


(algo) 


(algo) 


(%) 


1 


0.010 


-0.9 


3.634 


4.331 


[4.214,4.452] 


19.170 


1 


0.010 


-0.8 


3.634 


4.149 


[4.055,4.247] 


14.185 


1 


0.010 


-0.6 


3.634 


3.884 


[3.818,3.952] 


6.892 


1 


0.001 


-0.9 


3.634 


3.911 


[3.798,4.029] 


7.620 


1 


0.001 


-0.8 


3.634 


3.734 


[3.646, 3.825] 


2.759 


1 


0.001 


-0.6 


3.634 


3.533 


[3.466, 3.602] 


2.775 


5 


0.010 


-0.9 


185 


208.3 


[199.6,217.7] 


12.591 


5 


0.010 


-0.6 


185 


221.2 


[214.3,228.4] 


19.577 


5 


0.001 


-0.9 


185 


187.4 


[180.5, 194.8] 


1.292 


5 


0.001 


-0.6 


185 


193.2 


[188.3, 198.3] 


4.459 


7 


0.010 


-0.9 


1405 


1515 


[1468, 1565] 


7.832 


7 


0.010 


-0.6 


1405 


1642 


[1567, 1722] 


16.847 


7 


0.001 


-0.9 


1405 


1380 


[1316, 1449] 


1.808 


7 


0.001 


-0.6 


1405 


1400 


[1358, 1444] 


0.370 



Table 3: Transition times for small values of (3, with various time steps At and 
z m \ a . Reference values are computed by DNS. 



/3 


At 


^min 


Error on E(T3) 


Error on p 


1 


0.01 


-0.9 


1.73% 


9.45% 


1 


0.01 


-0.8 


2.05% 


5.81% 


1 


0.01 


-0.6 


1.76% 


3.20% 


1 


0.001 


-0.9 


0.09% 


2.84% 


1 


0.001 


-0.8 


0.19% 


1.46% 


1 


0.001 


-0.6 


0.10% 


0.17% 


5 


0.01 


-0.9 


0.97% 


4.79% 


5 


0.01 


-0.6 


0.92% 


4.23% 


5 


0.001 


-0.9 


0.39% 


1.17% 


5 


0.001 


-0.6 


0.04% 


1.35% 


7 


0.01 


-0.9 


0.54% 


4.63% 


7 


0.01 


-0.6 


0.96% 


5.97% 


7 


0.001 


-0.9 


0.21% 


1.04% 


7 


0.001 


-0.6 


0.33% 


1.17% 



Table 4: Comparison between DNS and the algorithm on the estimation of 
E(T3) and of the probability p, for small values of f3. 
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p 


V 


E(Ti + T 3 ) 


E(Ti + T 2 ) 


E(T) 


log(E(T))//3 


1 


1.350 10" 1 


0.49930 


0.47300 


3.529 


1.261000 


5 


1.240 10~ 2 


1.13658 


2.41300 


1.932 10 2 


1.052788 


7 


3.347 10" 3 


1.23931 


4.69608 


1.400 10 3 


1.034869 


10 


1.411 10" 5 


1.55896 


0.37247 


2.640 10 4 


1.018097 


15 


1.239 10~ 7 


1.70723 


0.47593 


3.842 10 6 


1.010760 


20 


1.007 10~ 9 


1.80644 


0.58504 


5.807 10 8 


1.008986 


25 


8.381 10~ 12 


1.89356 


0.70129 


8.367 10 10 


1.006007 


30 


6.558 10" 14 


1.97645 


0.82802 


1.263 10 13 


1.005560 


40 


4.043 10~ 18 


2.11230 


1.12890 


2.792 10 17 


1.004270 



Table 5: Computation of the mean transition time for various values of /3. For 
small values of /3 < 7, results come from Table O taking the best choice of At 
and £ m in- For /3 > 10, the numerical parameters are At = 0.001, N = 10 5 and 
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Figure 10: Comparison of the estimated mean transition times as a function of 
j3 with the asymptotic law from large deviation theory. 
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N 
xlO 3 



2 

10 
50 
100 
10 
50 



At 



0.01 
0.01 
0.01 
0.01 
0.001 
0.001 



p 

(algo) 



1.16 10" 2 
1.06 10" 2 

1.08 10~ 2 

1.09 10~ 2 

5.42 10" 3 

5.43 10~ 3 



C.I. on p 

(algo) 



[1.04,1-28] 10" 2 

[1.05,1.07] 10" 2 

[1.08,1.08] 10" 2 

[1.09,1.09] 10" 2 

[5.37,5.47] 10" 3 

[5.33,5.54] 10" 3 



P 

(DNS) 



1.07 10~ 2 

1.08 10" 2 
1.08 10~ 2 
1.08 10~ 2 



Table 6: Computation of the probability p of a reactive path for j3 

£ = &• 



1.67 and 



T 2 
T 3 



inf {t > 0, X t £ A | X ~ i/ Sz 
inf {t >0,X t eB\X ~ UTiz 



and VsG (0,t), X s (£ B}, 
and Vs G (0,t), I s ^ A}. 



We use again formula (|10p to estimate E(T). The two reaction coordinates £i 
and ^2 introduced in Section T3.2I will be used. In all the computations, we take 
Z miTi = 0.1, and z max = 0.9 when the reaction coordinate is £i or z max = 1.5 
when the reaction coordinate is £2- 

First, in the case £ = £2 and (3 = 1.67, p is estimated using the multilevel 
splitting algorithm and a DNS. Results are presented in Table El and show 
excellent agreement between the two estimates. We then check the consistency 
with the DNS on the transition time estimation. The results are presented in 
Table [7] for different values of the number of paths N and time-step size At. 
We again observe an excellent agreement. 

Let us now consider the case £ = £2 and /3 = 6.67. Results are given in 
Table [HJ Note that in this case, E(Ta) is not significant in the estimation of 
E(T). Figure fTT1 illustrates the convergence as N increases. 

In order to compare with results obtained via the transition path theory 
in [24] , we present in Table the estimation of the reactive rate k^B from A 
to B which, by symmetry of the problem, is related to the mean transition time 
as: kAB = 2/E(T). For /3 = 1.67, the rate computed ma the multilevel splitting 
algorithm is consistent with those predicted from DNS and transition path 
theory. For j3 = 6.67 the rate is so small that any computation via DNS would 
lead to totally unreasonable effort. The estimate obtained using the multilevel 
splitting algorithm is very close from the rate computed by transition path 
theory. 

Let us now consider the reaction coordinate £i(x, y) = x. Results are given 
in Table [TUJ We observe that even though S Zmin is in this case very far from 
an isocommittor, the results are still consistent with those obtained above. 
Accordingly with the results of Section 13.21 we note a larger variance on the 
results with this choice of the reaction coordinate, when (3 = 6.67. The main 
source of variance is in the estimation oyy of p. 
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N 


At 


E(Ti + T 2 ) 


E(Ti + T 3 ) 


E(T) 


C.I. on 


E(T) 


Relative 


xlO 3 




(DNS) 


(algo) 


(algo) 


E(T) 


(DNS) 


error (%) 


2 


0.01 


2.72 10" 1 


1.43 


24.66 


[22.48; 27.35] 


26.54 


7.61% 


10 


0.01 


2.73 10" 1 


1.56 


27.14 


[26.88; 27.41] 


26.50 


2.38% 


50 


0.01 


2.74 10" 1 


1.54 


26.66 


[26.61; 26.71] 


26.53 


0.48% 


100 


0.01 


2.73 10" 1 


1.52 


26.41 


[26.38; 26.44] 


26.50 


0.34% 


10 


0.001 


1.42 10" 1 


1.32 


27.32 


[27.10; 27.55] 


26.49 


3.05% 


50 


0.001 


1.42 10" 1 


1.30 


27.20 


[26.73; 27.70] 


26.50 


2.62% 



Table 7: Computation of the mean transition time for /3 = 1.67 and £ = £2- 



N 


At 


p 


E(Ti + T 2 ) 


E(Ti + T 3 ) 


E(T) 


C.I. on 


xlO 3 




(algo) 


(DNS) 


(algo) 


(algo) 


E(T) 


2 


0.01 


5.37 10" 8 


2.694 10" 1 


14.72 


5.02 10 6 


[4.10; 6.46] 10 B 


10 


0.01 


4.96 10~ 8 


2.694 10" 1 


15.94 


5.44 10 6 


[4.68; 6.48] 10 6 


50 


0.01 


4.79 10~ 8 


2.696 10" 1 


15.75 


5.63 10 6 


[5.35; 5.94] 10 6 


100 


0.01 


5.03 10~ 8 


2.698 10- 1 


14.71 


5.36 10 6 


[5.22; 5.50] 10 6 


10 


0.001 


4.50 10" 8 


2.036 10" 1 


14.67 


4.52 10 6 


[3.78; 5.62] 10 6 



Table 8: Computation of the mean transition time for j3 = 6.67 and £ = £2- 




Figure 11: Convergence of the mean transition time E(T) as a function of N 
for p = 6.67, At = 0.01 and f = &• 
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N 


P 


At 


kAB 


C.I. on 


xlO 3 






(algo) 


kAB 


2 


1.67 


0.01 


2.03 10" 2 


[1.83 


2.22] 10~ 2 


10 


1.67 


0.01 


1.84 10" 2 


[1.82 


1.86] 10" 2 


50 


1.67 


0.01 


1.88 10~ 2 


[1.87 


1.88] 10" 2 


100 


1.67 


0.01 


1.89 10" 2 


[1.89 


1.90] 10~ 2 


10 


1.67 


0.001 


1.83 10- 2 


[1.81 


1.84] 10~ 2 


50 


1.67 


0.001 


1.84 10" 2 


[1.80 


1.87] 10~ 2 


2 


6.67 


0.01 


9.97 10" 8 


[7.74 


12.2] 10~ 8 


10 


6.67 


0.01 


9.20 10~ 8 


[7.71 


10.7] 10~ 8 


50 


6.67 


0.01 


8.88 10~ 8 


[8.42 


9.34] 10~ 8 


100 


6.67 


0.01 


9.32 10~ 8 


[9.08 


9.57] 10" 8 


10 


6.67 


0.001 


1.11 10" 7 


[8.89 


13.2] 10" 8 



Table 9: Reaction rate £ = £2- Values from 
= 1.67 and k AB = 9.47 10" 8 for p = 6.67. 



are k A B = 1-912 10" 2 for 



N 
xlO 3 


P 


At 


V 

(algo) 


E(Ti + T 2 ) 
(DNS) 


E(Ti + T 3 ) 
(algo) 


E(T) 

(algo) 


C.I. on 

E(T) 


10 
50 
10 
50 


1.67 
1.67 
6.67 
6.67 


0.01 
0.01 
0.01 
0.01 


2.15 10" 2 
2.14 10~ 2 
2.61 10~ 7 
2.92 10" 7 


5.47 lO" 1 
5.42 10" 1 

1.82 

1.80 


1.56 
1.54 
11.22 
11.24 


26.45 

26.34 

6.96 10 6 

6.15 10 6 


[26.09; 26.82] 

[26.03; 26.67] 

[4.45; 15.9] 10 6 

[4.88; 8.32] 10 6 



Table 10: Computation of the mean transition time with £ = £1. 
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6 Conclusion and perspectives 

We presented a multiple replica algorithm to generate an ensemble of reactive 
trajectories. We illustrated its efficiency and accuracy on two test cases. We 
also proposed an estimator of the transition times. Future works are of course 
required in order to test the interest of such an approach for larger systems. 

In conclusion, we would like to mention two possible extensions of the ap- 
proach. First, in a case where only the initial metastable state A is known, 
one could think of using this algorithm to force the system to leave A (with- 
out knowing the metastable states around a priori), by using as a reaction 
coordinate the distance to a reference configuration xa in A (like £2 above). 
The paths would then be generated until they go back to A, or they reach a 
given fixed final time T. This would generate equilibrium trajectories of time 
length T, conditionally to reach a certain distance from xa- It should be an 
efficient procedure to explore the energy landscape at a fixed positive temper- 
ature. Second, it would be interesting to test an adaptive procedure which, at 
the end of the algorithm, approximates the committor function thanks to the 
reactive trajectories, and then uses this approximation as a reaction coordinate, 
iteratively. In particular, note that the isocommittors are also isolines of the 
function exp(/3V)p, where p is the density along reactive paths which seems to 
be accurately obtained by our algorithm (see Section EPj) . This should produce 
better and better results as the approximations of the isocommittors get more 
and more refined. 
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